[2/n] Lightweight Ray AIR API refactor #37123
Merged
pcmoritz
merged 80 commits into
ray-project:master
from
pcmoritz:lightweight-ray-air-api-refactor-examples
Aug 2, 2023
pcmoritz
requested review from
richardliaw,
gjoliver,
krfricke,
xwjiang2010,
amogkam and
matthewdeng
as code owners
July 5, 2023 22:04
matthewdeng
approved these changes
Aug 2, 2023
@@ -701,7 +700,7 @@ def train_func(config):
 # Instantiate new Trainer in Trainable.
 trainer = trainer_cls(**config)

-# Get the checkpoint from the Tune session, and use it to initialize
+# Get the checkpoint from the train context, and use it to initialize
Should we be updating the code (under this comment) to use the new API as well? Or is that for a separate PR?
@@ -222,15 +222,15 @@ def __init__(
 raise ValueError(
 "'checkpoint_at_end' cannot be used with a function trainable. "
 "You should include one last call to "
-"`ray.air.session.report(metrics=..., checkpoint=...)` at the end "
+"`ray.train.session.report(metrics=..., checkpoint=...)` at the end "
Suggested change
-"`ray.train.session.report(metrics=..., checkpoint=...)` at the end "
+"`ray.train.report(metrics=..., checkpoint=...)` at the end "
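For context on what the error message asks for, here is a minimal sketch of a function trainable that reads its checkpoint from the train context and ends each iteration, including the last one, with `ray.train.report(metrics=..., checkpoint=...)`. It assumes a Ray release in which `ray.train.report`, `ray.train.get_checkpoint`, `Checkpoint.from_directory`, and `Checkpoint.as_directory` are available; check the documentation for your installed version. The helpers `save_state` and `load_state`, the `state.json` file name, the `num_steps` config key, and the placeholder loss are hypothetical and not part of this PR.

```python
import json
import os
import tempfile


def save_state(directory, step):
    """Write the loop state into `directory` (hypothetical helper)."""
    with open(os.path.join(directory, "state.json"), "w") as f:
        json.dump({"step": step}, f)


def load_state(directory):
    """Read the loop state back and return the last completed step."""
    with open(os.path.join(directory, "state.json")) as f:
        return json.load(f)["step"]


def train_func(config):
    # Imported lazily so the helpers above stay usable without Ray installed.
    from ray import train
    from ray.train import Checkpoint

    # Get the checkpoint from the train context; it is None on a fresh run.
    start = 0
    checkpoint = train.get_checkpoint()
    if checkpoint is not None:
        with checkpoint.as_directory() as ckpt_dir:
            start = load_state(ckpt_dir) + 1

    for step in range(start, config["num_steps"]):
        loss = 1.0 / (step + 1)  # placeholder metric, not a real model
        with tempfile.TemporaryDirectory() as tmp_dir:
            save_state(tmp_dir, step)
            # Reporting a checkpoint on every step means the final step
            # also reports one, so 'checkpoint_at_end' is not needed.
            train.report(
                metrics={"loss": loss, "step": step},
                checkpoint=Checkpoint.from_directory(tmp_dir),
            )
```

Under these assumptions, `train_func` could be passed to a `tune.Tuner` or a trainer in the usual way; the report call is made inside the temporary-directory context so the files still exist when the checkpoint is handed over.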
Co-authored-by: matthewdeng <matthew.j.deng@gmail.com> Signed-off-by: Philipp Moritz <pcmoritz@gmail.com>
Signed-off-by: Philipp Moritz <pcmoritz@gmail.com>
NripeshN
pushed a commit
to NripeshN/ray
that referenced
this pull request
Aug 15, 2023
This PR migrates all the train and tune examples and docstrings to the new API convention, see https://github.com/ray-project/enhancements/ Continuation of ray-project#36706 and ray-project#37906 Co-authored-by: matthewdeng <matthew.j.deng@gmail.com> Signed-off-by: NripeshN <nn2012@hw.ac.uk>
NripeshN
pushed a commit
to NripeshN/ray
that referenced
this pull request
Aug 15, 2023
Continuation of ray-project#37123 Signed-off-by: NripeshN <nn2012@hw.ac.uk>
harborn
pushed a commit
to harborn/ray
that referenced
this pull request
Aug 17, 2023
This PR removes some circularities in the Ray AIR import system so we can put the training related functions into `ray.train`. It introduces a training context and makes report, get_dataset_shard, Checkpoint, Result, and the following configs:

- CheckpointConfig
- DataConfig
- FailureConfig
- RunConfig
- ScalingConfig

available in `ray.train`. No user facing changes yet, the old APIs still work.

Going forward, it will be most consistent / symmetrical if these things are included in the following way:

```python
from ray import train, tune, serve  # Pick the subset that is needed
# Include what you need from the following:
from ray.train import CheckpointConfig, DataConfig, FailureConfig, RunConfig, ScalingConfig

# ...

def train_func():
    dataset_shard = train.get_dataset_shard("train")
    world_size = train.get_context().get_world_size()
    # ...
    train.report(...)

trainer = train.torch.TorchTrainer(
    train_func,
    scaling_config=ScalingConfig(num_workers=2),
)
result = trainer.fit()
```

We have many examples in ray-project#37123 on how this looks like in actual code.

Signed-off-by: harborn <gangsheng.wu@intel.com>
harborn
pushed a commit
to harborn/ray
that referenced
this pull request
Aug 17, 2023
This PR migrates all the train and tune examples and docstrings to the new API convention, see https://github.com/ray-project/enhancements/ Continuation of ray-project#36706 and ray-project#37906 Co-authored-by: matthewdeng <matthew.j.deng@gmail.com> Signed-off-by: harborn <gangsheng.wu@intel.com>
harborn
pushed a commit
to harborn/ray
that referenced
this pull request
Aug 17, 2023
Continuation of ray-project#37123 Signed-off-by: harborn <gangsheng.wu@intel.com>
arvind-chandra
pushed a commit
to lmco/ray
that referenced
this pull request
Aug 31, 2023
This PR removes some circularities in the Ray AIR import system so we can put the training related functions into `ray.train`. It introduces a training context and makes report, get_dataset_shard, Checkpoint, Result, and the following configs:

- CheckpointConfig
- DataConfig
- FailureConfig
- RunConfig
- ScalingConfig

available in `ray.train`. No user facing changes yet, the old APIs still work.

Going forward, it will be most consistent / symmetrical if these things are included in the following way:

```python
from ray import train, tune, serve  # Pick the subset that is needed
# Include what you need from the following:
from ray.train import CheckpointConfig, DataConfig, FailureConfig, RunConfig, ScalingConfig

# ...

def train_func():
    dataset_shard = train.get_dataset_shard("train")
    world_size = train.get_context().get_world_size()
    # ...
    train.report(...)

trainer = train.torch.TorchTrainer(
    train_func,
    scaling_config=ScalingConfig(num_workers=2),
)
result = trainer.fit()
```

We have many examples in ray-project#37123 on how this looks like in actual code.

Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
arvind-chandra
pushed a commit
to lmco/ray
that referenced
this pull request
Aug 31, 2023
This PR migrates all the train and tune examples and docstrings to the new API convention, see https://github.com/ray-project/enhancements/ Continuation of ray-project#36706 and ray-project#37906 Co-authored-by: matthewdeng <matthew.j.deng@gmail.com> Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
arvind-chandra
pushed a commit
to lmco/ray
that referenced
this pull request
Aug 31, 2023
Continuation of ray-project#37123 Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
vymao
pushed a commit
to vymao/ray
that referenced
this pull request
Oct 11, 2023
This PR migrates all the train and tune examples and docstrings to the new API convention, see https://github.com/ray-project/enhancements/ Continuation of ray-project#36706 and ray-project#37906 Co-authored-by: matthewdeng <matthew.j.deng@gmail.com> Signed-off-by: Victor <vctr.y.m@example.com>
vymao
pushed a commit
to vymao/ray
that referenced
this pull request
Oct 11, 2023
Continuation of ray-project#37123 Signed-off-by: Victor <vctr.y.m@example.com>
Why are these changes needed?
This PR migrates all the train and tune examples and docstrings to the new API convention, see https://github.com/ray-project/enhancements/
Continuation of #36706 and #37906
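As a rough illustration of what this kind of migration involves, the sketch below rewrites the two older spellings of the report call that appear in this PR's diff to the new `ray.train.report` form. It is not the tooling used for this PR: the helper name `migrate_report_calls` and the `RENAMES` table are hypothetical, and only the two renames visible in the review thread are covered.

```python
import re

# Old spellings seen in this PR's diff, mapped to the new convention.
# Hypothetical table: extend it by hand for any other renames you need.
RENAMES = {
    "ray.air.session.report": "ray.train.report",
    "ray.train.session.report": "ray.train.report",
}

# Match an old spelling only when it is not part of a longer dotted name
# or identifier (for example `my_ray.air.session.report` is left alone).
_PATTERN = re.compile(
    r"(?<![\w.])(" + "|".join(re.escape(old) for old in RENAMES) + r")(?!\w)"
)


def migrate_report_calls(text):
    """Return `text` with the old report spellings replaced by the new one."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], text)


old_line = '"`ray.air.session.report(metrics=..., checkpoint=...)` at the end "'
new_line = migrate_report_calls(old_line)
```

A plain text substitution like this is only suitable for docstrings and messages; code that imports `session` under another name would need to be updated by hand.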
Related issue number
Checks
- I've signed off every commit (`git commit -s`) in this PR.
- I've run `scripts/format.sh` to lint the changes in this PR.
- If I've added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.